May 29, 2025

Intense longitudinal data (ILD)

Uniquely suited to provide detailed information on psychological and behavioral processes over time at the within-person level

  • Daily Diary
  • Ecological momentary assessment (EMA)
  • Experience sampling method (ESM)
  • Ambulatory assessments (AA)
  • Sensors

publications using ILD increased from ~250 to >2.500 between 2000 and 2023 (PubMed)



Analyzing ILD data

Most popular model: (multilevel) vector autoregressive (VAR) model

VAR model: stable process in which a system is perturbed by outside influences.

We study how quickly the process returns to the single equilibrium point (e.g., the mean).

Example: emotion regulation.


McNeish & Hamaker, 2019.

Analyzing ILD data

But, what if the process switches between multiple phases, associated with different experiences?

Conceptual framework: e.g., dynamical systems that switch between a number of equilibrium points

Example: depression, bipolar disorder, … .


Helmich et al., 2024

Analyzing ILD data

Empirical data underline \(\rightarrow\) a single equilibrium point does not always provide appropriate reflection of the data:

  • multimodality detected in weekly symptom data (Hosenfeld et al., 2015).
  • multimodality, skewness, and “pointmass” at scale end frequently detected in EMA emotion measurements (Haslbeck et al., 2022)


Modeling (latent) state dynamics

Advantages of modeling a process that switches between multiple (latent) categorical states:

  • Quantify the dynamics over time:
    • Probability of remaining in same mood state from one time point to the next
      • e.g., duration of a depressed mood state
    • Which mood state is an individual likely to transition to if they switch?
  • Empirically derived composition of (mood) states.
  • Derive the most likely sequence of mood states for each individual over time.




Empirical example



Capturing (suicidal) crisis dynamics

Capturing crisis dynamics using ESM data


  • 26 patients, 60 observations per patient per CAB factor

Capturing crisis dynamics using ESM data



Capturing crisis dynamics using ESM data



Capturing crisis dynamics using ESM data



Capturing crisis dynamics using ESM data



Theory driven preference for HMMs

In the empirical example on suicidal crisis states:

  • How long does someone tend to stay in a low or high suicidal crisis state?
  • From the low suicidal crisis state, does someone tend to switch directly to the highest suicidal crisis state, or (first) to an intermediate crisis state?
  • What is the composition of the empirically derived suicidal crisis states?
    • e.g., what makes a state a high crisis state or a low crisis state?
  • What is the sequence of suicidal crisis states over time for each individual?

The hidden Markov model

HMM: probabilistic model to infer hidden states \(S_t \in (1, 2, ..., m)\) at

  • each time point \(t = 1, 2, ..., T\)
  • we assume sequence of hidden states forms a Markov chain: \(P(S_{t_1} | S_t, S_{t-1}, ..., S_1 = P(S_{t+1} | S_t)\) \(\rightarrow\) the probability of the next state depends only on the current state

The hidden Markov model

Probability of the hidden states at each point in time are inferred using:

  • the multivariate observed series \((y_{k1}, y_{k2}, ..., y_{kT}; k \in 1, 2, ..., K)\)
  • we assume each observation \(y_{kt}\) is generated by an underlying, latent state \(S_t\)

The hidden Markov model

Two sets of parameters:

The hidden Markov model

Two sets of parameters:

  1. probability of transitioning between latent states \(\gamma_{ij}=P(S_{t+1} = j | S_t = i)\)
  2. probability of emitting an observation given the current latent state \(P(y_{kt} | S_t)\)

The hidden Markov model

Given example data, assume

  • Normal emission distribution: \(P(y_{kt} | S_t) \sim N(\mu_{ki}, \sigma_{ki}^2)\)
  • \(y_{k.}\) are conditionally independent given \(\{S_t : t = 1, 2, ..., T\}\) to accommodate multivariate data

Four CAB based crisis states



CAB crisis state dynamics


Personalized crisis trajectories


Capturing crisis dynamics – conclusion

Pairing fine grained ESM data with (multilevel) HMM, we

  • Identified four distinct and ascending CAB based crisis states
  • Quantified the dynamics of crisis at the group level
  • Visualized crisis states over time at the patient individual level




mHMMbayes



R package for multilevel hidden Markov models using Bayesian estimation

mHMMbayes

  • mHMMbayes fits multilevel hidden Markov models using Bayesian estimation
  • Allows for heterogeneity between subjects, while estimating one overall HMM
  • The model accommodates
    • continuous, categorical and count multivariate data
    • individual level covariates
    • missing data in the dependent variable(s)
  • includes functions for simulating data, obtaining the most likely hidden state sequence (Viterbi algorithm), and automated plotting.

mHMMbayes - Bayesian estimation

Advantages:

  • Bayesian methods yield many additional metrics, such as standard errors and credibility intervals.
  • Relatively easy to handle missing data (assuming MAR).
  • Allows incorporation of prior knowledge (through prior distribution), accommodates smaller sample sizes.
  • Adds flexibility, making it relatively easy to extend to more complex models.

mHMMbayes - Bayesian estimation

Disadvantages:

  • Label switching problem (more prominent in continuous and count data).
  • Solution:
    • sensible starting values
    • weakly informative priors
    • (hyper-prior values for) \(\bar{\mu}_i\) or \(\bar{\lambda}_i\) ordered in a sensible manner given data and process.

The mHMMbayes package: an example


26 patients, 60 observations per patient per CAB factor

Fitting a multilevel HMM – input

Fitting a 4 state multilevel hidden Markov model on the CAB crisis data:

library(mHMMbayes)

# Run a model without covariate(s) and default prior for gamma:
out_4st <- mHMM(s_data          = CAB_mHMM,           # Data to be used
                data_distr      = "continuous",       # Type of emission distribution
                gen             = list(...),          # General model properties
                start_val       = list(...),          # Starting values
                emiss_hyp_prior = ...,                # Parameter values of prior distributions
                mcmc            = list(J = J, burn_in = burn_in)) # MCMC settings

Fitting a multilevel HMM – general properties

## specifying general model properties:
 # number of states m
m <- 4
 # number of dependent variables
n_dep <- 5

gen <- list(m = m, n_dep = n_dep)

Fitting a multilevel HMM – starting values

Needed for the first iteration of the ‘forward algorithm’ \(\rightarrow\) determining the probability of each state for each point in time for each individual

Can be based on:

  • data driven approach \(\rightarrow\) fitting a single HMM on the completely pooled data (e.g., using DepmixS4).
  • theory driven \(\rightarrow\) given the process you are studying, the variables collected, and the scale of the variables, what are sensible
    • means (and SDs) for each of the states (e.g., a ‘good’ and ‘bad’ state),
    • transition probabilities.
  • combination \(\rightarrow\) combine theoretical considerations with information from (visual) exploration of the data.

Fitting a multilevel HMM – starting values

Starting values for the transition probability matrix

  • values on the diagonal represent ‘self-transitions’
  • off-diagonal values represent the probabilities to switch between states
  • usually, the diagonal is composed of larger values than the off-diagonal
  • each row needs to sum to 1
start_tpm <- matrix(c(..., ..., ..., ...,
                      ..., ..., ..., ...,
                      ..., ..., ..., ...,
                      ..., ..., ..., ...), byrow = TRUE, ncol = m, nrow = m)

Fitting a multilevel HMM – starting values

Starting values for the transition probability matrix

  • values on the diagonal represent ‘self-transitions’
  • off-diagonal values represent the probabilities to switch between states
  • usually, the diagonal is composed of larger values than the off-diagonal
  • each row needs to sum to 1
start_tpm <- matrix(c(0.70, 0.10, 0.10, 0.10,
                      0.10, 0.70, 0.10, 0.10,
                      0.10, 0.10, 0.70, 0.10,
                      0.10, 0.10, 0.10, 0.70), byrow = TRUE, ncol = m, nrow = m)
start_tpm <- matrix(c(0.91, 0.03, 0.03, 0.03,
                      0.03, 0.91, 0.03, 0.03,
                      0.03, 0.03, 0.91, 0.03,
                      0.03, 0.03, 0.03, 0.91), byrow = TRUE, ncol = m, nrow = m)

Fitting a multilevel HMM – starting values

Starting values for the (Gaussian) emission distribution

  • what are likely observations for each of the states (e.g., for a low and high crisis state)?
  • means and SDs need to be specified such that all observations within a variable receive reasonable support.
  • take negative correlations between variables into account!
    • E.g., high self-control likely goes together with low suicidal ideation.
  • list with one matrix per dependent variable, with 2 columns – mean and SD – and m rows.

Fitting a multilevel HMM – starting values

Starting values for the (Gaussian) emission distribution

start_emission <- list(matrix(c(..., ..., 
                                ..., ...,
                                ..., ..., 
                                ..., ...), byrow = TRUE, ncol = 2, nrow = m), # self-control
                       matrix(c(..., ..., 
                                ..., ...,
                                ..., ..., 
                                ..., ...), byrow = TRUE, ncol = 2, nrow = m), # negative-affect
                       matrix(c(..., ..., 
                                ..., ...,
                                ..., ..., 
                                ..., ...), byrow = TRUE, ncol = 2, nrow = m), # contact avoidance
                       ...

Fitting a multilevel HMM – starting values

Starting values for the (Gaussian) emission distribution


Fitting a multilevel HMM – starting values

Starting values for the (Gaussian) emission distribution

start_emission <- list(matrix(c(60, 15, 
                                40, 10,
                                20, 10, 
                                 3, 2), byrow = TRUE, ncol = 2, nrow = m), # self-control
                       matrix(c(20, 10, 
                                45, 10,
                                65, 10, 
                                82, 10), byrow = TRUE, ncol = 2, nrow = m), # negative-affect
                       matrix(c( 3, 2, 
                                15, 5,
                                40, 10, 
                                75, 10), byrow = TRUE, ncol = 2, nrow = m), # contact avoidance
                       ...

Fitting a multilevel HMM – emission prior

Specifying parameter values for the prior on the emission distribution is mandatory, as it helps with the label switching problem. We want to convey:

  • what are likely observations for each of the states (e.g., for a low and high crisis state)?
  • what is a likley amount of variation between individuals in the state means?
  • means and variances need to be specified such that all observations within a variable receive reasonable support.
  • take negative correlations between variables into account!
    • E.g., high self-control likely goes together with low suicidal ideation.
  • can re-use some of the values specified in the starting values!

Fitting a multilevel HMM – emission prior

Specified using the function prior_emiss_cont():

prior_emiss_cont(
  gen,              # general model properties m and n_dep
  
  # State means
  emiss_mu0,        # hyper-prior mean values of the Normal emission distribution 
  emiss_K0,         # number of hypothetical subjects emiss_mu0 is based on
  
  # Variance between individuals in state means
  emiss_V,          # degrees of freedom inverse Gamma hyper-prior on BETWEEN subject variance 
  emiss_nu,         # hyper-prior variance of Inverse Gamma on BETWEEN subject variance 
  
  # Variance within a state within individuals
  emiss_a0,         # shape values of IG hyper-prior on emission standard deviation^2 
  emiss_b0,         # rate values of IG hyper-prior on emission standard deviation^2 
)

More on IG hyper-prior on emission \(\sigma^2\)

Inverse Gamma hyper-prior on variance of Normal emission distribution:

Fitting a multilevel HMM – emission prior

Specified using the function prior_emiss_cont():

emiss_prior <- prior_emiss_cont(
  gen              = list(m = m, n_dep = n_dep),
  emiss_mu0,       = list(matrix(c(60, 40, 20,  3), nrow = 1), # self-control
                          matrix(c(20, 45, 65, 82), nrow = 1), # negative-affect
                          matrix(c( 3, 15, 40, 75), nrow = 1), # contact avoidance)
                          ... )
  emiss_K0         = rep(list((1)),n_dep),
  emiss_V,         = rep(list((1)),n_dep),
  emiss_nu,        = rep(list(rep(10^2, m)), n_dep),
  emiss_a0,        = rep(list(rep(1.5, m)), n_dep),
  emiss_b0,        = rep(list(rep(20, m)), n_dep))
)

Fitting a multilevel HMM

library(mHMMbayes)

# Run a model without covariate(s) and default prior for gamma:
out_4st <- mHMM(s_data          = CAB_mHMM,
                data_distr      = "continuous",,
                gen             = list(m = m, n_dep = n_dep),
                start_val       = c(list(start_tpm), start_emission),
                emiss_hyp_prior = emiss_prior,
                mcmc            = list(J = J, burn_in = burn_in))

Model output – print()

out_4st
## Number of subjects: 26 
## 
## 10000 iterations used in the MCMC algorithm with a burn in of 5000 
## Average Log likelihood over all subjects: -1256.123 
## Average AIC over all subjects:  2616.246 
## Average AICc over all subjects: 2993.319 
## 
## Number of states used: 4 
## 
## Number of dependent variables used: 5 
## 
## Type of dependent variable(s): continuous

Model output – summary()

summary(out_4st)
## State transition probability matrix 
##  (at the group level): 
##  
##              To state 1 To state 2 To state 3 To state 4
## From state 1      0.808      0.106      0.047      0.039
## From state 2      0.048      0.717      0.116      0.119
## From state 3      0.042      0.210      0.644      0.104
## From state 4      0.022      0.254      0.180      0.544
## 
##  
## Emission distribution ( continuous ) for each of the dependent variables 
##  (at the group level): 
##  
## $NegatiefAffect
##           Mean     SD
## State 1 31.648 13.321
## State 2 51.616 12.412
## State 3 59.418 14.625
## State 4 72.094 16.227
## 
## $Controle
##           Mean     SD
## State 1 43.719 14.819
## State 2 34.521 13.717
## State 3 32.868 17.035
## State 4 14.119  8.765
## 
## $Terugtrekken
##           Mean     SD
## State 1 17.333 15.906
## State 2 30.279 18.155
## State 3 39.081 25.899
## State 4 48.589 33.481
## 
## $BehoefteContact
##           Mean     SD
## State 1 41.840 17.458
## State 2 40.459 20.986
## State 3 42.033 25.729
## State 4  6.978  3.555
## 
## $Suicidaliteit
##           Mean     SD
## State 1  3.903  2.418
## State 2 48.334 16.300
## State 3 55.728 20.037
## State 4 65.546 17.589

Start practical 1

References

  • Aarts, E., Montagne, B., van der Meer, T. J., & Hagenaars, M. A. (2025). Capturing crisis dynamics: a novel personalized approach using multilevel hidden Markov modeling. Frontiers in Psychiatry, 15, 1501911. Doi: 10.3389/fpsyt.2024.1501911.
  • Haslbeck, J., Ryan, O., & Dablander, F. (2023). Multimodality and skewness in emotion time series. Emotion, 23(8), 2117. Doi: 10.1037/emo0001218.
  • Helmich, M. A., Schreuder, M. J., Bringmann, L. F., Riese, H., Snippe, E., & Smit, A. C. (2024). Slow down and be critical before using early warning signals in psychopathology. Nature Reviews Psychology, 3(11), 767-780. Doi: 10.1038/s44159-024-00369-y.
  • Hosenfeld, B., Bos, E. H., Wardenaar, K. J., Conradi, H. J., van der Maas, H. L., Visser, I., & de Jonge, P. (2015). Major depressive disorder as a nonlinear dynamic system: bimodality in the frequency distribution of depressive symptoms over time. BMC psychiatry, 15, 1-9.Doi: 10.1186/s12888-015-0596-5.
  • McNeish, D., & Hamaker, E. L. (2020). A primer on two-level dynamic structural equation models for intensive longitudinal data in Mplus. Psychological methods, 25(5), 610. Doi: 10.1037/met0000250.

Note: specifying (weakly) informative-priors – gamma

Transition probability matrix Gamma (prior specification is optional):

prior_gamma(
  m,                  # number of states in the model
  gamma_mu0,          # hyper-prior mean values on the Multinomial logit intercepts
  gamma_K0 = NULL,    # number of hypothetical subjects gamma_mu0 is based on
  gamma_nu = NULL,    # degrees of freedom inverse Wishart hyper-prior on covariance 
  gamma_V = NULL,     # hyper-prior variance-covariance matrix of Inverse Wishart 
)


prob_to_int(
  prob_matrix         # complete transition probability matrix gamma
)

mHMMbayes - Metropolis within Gibbs sampler

  1. Obtain forward probabilities \(\alpha_{t(ni)} = Pr\big(O_1 = o_1, ..., O_t = o_t, S_t = i\big)\) for each subject using current values of subject specific model parameters.
  2. Sample hidden state sequences in backward manner using \(\alpha_{t(ni)}\) and \(\gamma_{nij}\).
  3. Given current subject-specific parameters, draw new group-level parameter estimates using Gibbs step
  4. Given
    • observed event sequence for each subject
    • sampled hidden state sequence for each subject
    • group-level parameter distributions
    • draw new subject-specific parameters using RW Metropolis sampler
  5. Repeat step 1-4 a very large number of times